# A tibble: 2 × 1
  `PCOS dimensions`
              <int>
1               541
2                44
# A tibble: 2 × 2
  PCOS_diagnosis     n
  <chr>          <int>
1 No               364
2 Yes              177
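Summaries of this shape could be produced roughly as follows. This is only a sketch: the `pcos` tibble and its `age` column are hypothetical stand-ins, since the code that loads the real data (541 rows, 44 columns) is not shown here.

```r
library(dplyr)

# Hypothetical stand-in for the PCOS data; the real set is not loaded here
pcos <- tibble(
  PCOS_diagnosis = c("No", "Yes", "No", "No"),
  age            = c(28, 33, 25, 31)
)

# dim() returns c(rows, columns); wrapping it in a tibble gives a 2 x 1 table
dims <- tibble(`PCOS dimensions` = dim(pcos))

# count() tallies the number of patients per diagnosis
counts <- pcos |>
  count(PCOS_diagnosis)

print(dims)
print(counts)
```

In the first table, row 1 is the number of patients and row 2 the number of variables.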
Polycystic ovary syndrome (PCOS) is a syndrome documented in women of menstruating age
Commonly documented symptoms include period pain, irregular periods, ovary-related problems and hormone imbalance
Patients with PCOS often have difficulty becoming pregnant and may experience complications during pregnancy
However, the cause of PCOS has still not been established.
The aim of this study is to examine a data set (found on Kaggle) of patients with and without PCOS. The data were collected in India, at 10 different hospitals.
Unit changes (inch to cm)
Rounding BMI
Grouping BMI into categories
Change blood type and cycle from numeric codes to character labels
Create new column for cycle/pregnancy stage
Merge data frames into one file
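A few of the steps listed above can be sketched as follows. Everything in this example is an assumption made for illustration: the column names (`patient_id`, `hip_inch`, `blood_group`, `cycle`, `fsh`), the numeric codes and their labels, and the toy values are hypothetical and not taken from the project's actual code.

```r
library(dplyr)

# Hypothetical raw data; column names and numeric codes are assumptions
raw <- tibble(
  patient_id  = 1:3,
  hip_inch    = c(36, 38, 40),
  blood_group = c(11, 15, 13),
  cycle       = c(2, 4, 2)
)

# Hypothetical second table that shares the patient identifier
hormones <- tibble(
  patient_id = 1:3,
  fsh        = c(7.9, 6.7, 5.5)
)

cleaned <- raw |>
  mutate(
    # Unit change: 1 inch = 2.54 cm
    hip_cm = hip_inch * 2.54,
    # Numeric codes to character labels (assumed mapping)
    blood_group = case_when(
      blood_group == 11 ~ "A+",
      blood_group == 13 ~ "B+",
      blood_group == 15 ~ "O+"
    ),
    cycle = case_when(
      cycle == 2 ~ "Regular",
      cycle == 4 ~ "Irregular"
    )
  ) |>
  select(-hip_inch)

# Merge the separate tables into one data frame by the shared key
merged <- cleaned |>
  left_join(hormones, by = "patient_id")
```

Codes that match none of the `case_when()` conditions become `NA`, which makes unexpected values easy to spot after recoding.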
# Round BMI and divide it into categories
library(dplyr)  # mutate(), case_when(), relocate()

body_measurements <- body_measurements |>
  mutate(BMI = round(BMI, 1)) |>
  mutate(BMI_class = case_when(
    BMI < 18.5 ~ "Underweight",
    BMI >= 18.5 & BMI < 25 ~ "Normal weight",
    BMI >= 25 & BMI < 30 ~ "Overweight",
    BMI >= 30 ~ "Obesity")) |>
  # Order the factor levels from lowest to highest BMI
  mutate(BMI_class = factor(BMI_class,
                            levels = c("Underweight",
                                       "Normal weight",
                                       "Overweight",
                                       "Obesity"))) |>
  relocate(BMI_class, .after = BMI)

| 02 Clean data | 03 Augment data |
|---|---|
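As an alternative to `case_when()`, the BMI classes could also be built with base R's `cut()`, which returns an ordered set of factor levels in one step. This is a sketch of a swapped-in technique, not the code used in the project; the `toy` tibble and its values are hypothetical and chosen to sit on the class boundaries.

```r
library(dplyr)

# Hypothetical toy data with values on and around the class boundaries
toy <- tibble(BMI = c(17.04, 18.5, 24.96, 25, 29.9, 30, 41.2))

toy_classed <- toy |>
  mutate(
    BMI = round(BMI, 1),
    # right = FALSE makes each interval closed on the left, e.g. [18.5, 25)
    BMI_class = cut(
      BMI,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Normal weight", "Overweight", "Obesity"),
      right  = FALSE
    )
  )

print(toy_classed)
```

Because rounding happens before classification, a raw value of 24.96 becomes 25.0 and lands in "Overweight". The order of the two steps therefore matters for patients close to a boundary.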